Abstract

Despite claims that Bell's inequalities are based on the Einstein locality condition,
or an equivalent, all derivations make an identical mathematical assumption:
that local hidden-variable theories produce a set of positive-definite probabilities for
detecting a particle with a given spin orientation. The standard argument is that
because quantum mechanics assumes that particles are emitted in a superposition
of states, the theory cannot produce such a set of probabilities. We examine a paper
by Eberhard, and several similar papers, which claim to show that a generalized
Bell inequality, the CHSH inequality, can be derived solely on the basis of the
locality condition, without recourse to hidden variables. We point out that these
authors nonetheless assume a set of positive-definite probabilities, which supports
the claim that neither hidden variables nor "locality" is at issue here; positive-definite
probabilities are. We demonstrate that quantum mechanics does predict a set of
probabilities that violate the CHSH inequality; however, these probabilities are not
positive-definite. Nevertheless, they are physically meaningful in that they give the
usual quantum-mechanical predictions in physical situations. We discuss in what
sense our results are related to the Wigner distribution.

PACS: 03.65.-w, 03.65.Bz
1 Introduction

With the introduction of his celebrated inequalities in 1964, John Bell [1] provided the
basis for an experimental test to distinguish quantum mechanics from local hidden-variable
theories. Since that time the universal interpretation of the results has been that
quantum mechanics violates Bell's inequalities due to its "nonlocal" character, whereas local
hidden-variable theories satisfy the inequalities because, as their name implies, they are
"local."

The situation is actually not so transparent. Bohr taught us to be aware of ambiguous
language. Although derivations of Bell's inequalities are evidently based on Einstein's
"locality" condition, couched in various phrases such as "principle of separability" and so
forth, mathematically all derivations make an identical assumption, specifically:
hidden-variable theories introduce a set of a priori positive-definite probabilities P that are not
predicted by quantum mechanics. In Bohm's classic version of the Einstein-Podolsky-Rosen
experiment, for example, a particle in a spin-singlet state decays into two daughter
particles with zero total angular momentum (see, e.g., Sakurai's text [2] or Sudarshan
and Rothman [3], henceforth SR). According to local hidden-variable theories there is
an a priori positive-definite probability that the daughter particles will be detected with
spins "up" along a chosen axis. Quantum mechanics, on the other hand, assumes that
the daughter particles are in a superposition of states and so, by definition, there can be
no a priori probability P such that their spins will be detected along a given direction.

Contrary to this view, in SR we pointed out that quantum mechanics does predict 
a set of a priori probabilities, in exactly the same way as do hidden-variable theories, 
but the quantum probabilities are not positive-definite. They are nevertheless meaningful 
in that when applied to physical situations they give the standard quantum-mechanical 
answers, in particular the usual violation of Bell's inequalities. Given the exact analogy
in producing the two sets of probabilities, the distinction between "local" hidden-variable
theories and "non-local" quantum mechanics is dissolved. From this point of view one
merely has two competing theories that give two different sets of probabilities; it is
unsurprising that hidden-variable theories fail experimental tests of Bell's inequalities
because they use the wrong set of probabilities for a quantum-mechanical problem.

The notion of "extended" probabilities dates back to Dirac, and we have not been the
only authors to suggest that they can resolve the EPR paradox (see [4, 5]) but, needless
to say, the SR argument has not found widespread acceptance. Recently, several rather
old papers, in particular one by Eberhard [6] entitled "Bell's Theorem Without Hidden
Variables," have come to our attention. Eberhard's paper is of interest because it claims
to show that a more general version of Bell's inequalities, known as the CHSH inequality
(after Clauser, Horne, Shimony and Holt) [7], is violated by quantum mechanics, and that
the CHSH inequality can be demonstrated solely on the basis of the locality principle,
without the introduction of hidden variables. (A slightly later paper by Peres [8] gives an
almost identical argument; one by Stapp [9] is in some respects similar.) At first sight
these proofs appear to assume little more than $2 < 2\sqrt{2}$. On closer inspection, however,
we find that they "play into our hands," i.e., they may not make an explicit statement
about hidden variables but they do assume a set of positive-definite probabilities. We now
demonstrate that this is so, reinforcing the contention in SR that, despite any words employed,
the crucial mathematical assumption in derivations of Bell's inequalities is not locality but
positive probability.

2 The Eberhard Argument 

Eberhard considers two identical apparatuses, A and B, at two different locations. On
apparatus A is a knob a that can be turned to two positions, 1 and 2. On apparatus B is
a knob b that can also be turned to two positions, 1 and 2. With its knob at either position
apparatus A can record a series of events. It is not important exactly what the events are,
but we assume that for each event each apparatus can measure only one of two possible
outcomes, which for simplicity we take to be ±1. When the knob a is in the 1 position, we
designate the outcome of the jth event as $\alpha_{1j}$, with similar notation for position 2 and knob
b. For each event we can thus in principle have: $\alpha_{1j} = \pm 1$, $\alpha_{2j} = \pm 1$, $\beta_{1j} = \pm 1$, $\beta_{2j} = \pm 1$.
However, for each measurement we will choose only one setting on each apparatus, so a
given event will produce a pair of readings, such as $\alpha_1 = 1$, $\beta_2 = -1$. (Here and below we
suppress the subscript j when it will not cause confusion.)

For a series of N measurements Eberhard next defines a quantity C, such that

$$
C = \frac{1}{N}\sum_{j=1}^{N} \alpha_j \beta_j . \tag{2.1}
$$

We see that $C = \langle \alpha_j \beta_j \rangle$, the statistical mean of the N products $\alpha_j \beta_j$. No restriction
is placed on the fraction of the N measurements for which the $\alpha$'s and $\beta$'s come out positive
or negative, but note that each product $\alpha_j \beta_j = 1$ when $\alpha$ and $\beta$ have the same sign and
$\alpha_j \beta_j = -1$ when they have opposite signs. Thus C represents the fraction of events in
which $\alpha$ and $\beta$ have the same sign minus the fraction in which they have opposite sign.

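Because each knob has two positions, there are four possible versions of C. That is,
we can define

$$
\begin{aligned}
C_{11} &= \langle \alpha_1 \beta_1 \rangle \\
C_{12} &= \langle \alpha_1 \beta_2 \rangle \\
C_{21} &= \langle \alpha_2 \beta_1 \rangle \\
C_{22} &= \langle \alpha_2 \beta_2 \rangle
\end{aligned}
\tag{2.2}
$$

(sum on j understood). Here, $C_{11}$ is just the above statistical mean when knobs a and b
are both in position 1, and so forth.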
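Now, for each event let

$$
\gamma = \alpha_1 \beta_1 + \alpha_1 \beta_2 + \alpha_2 \beta_1 - \alpha_2 \beta_2 . \tag{2.3}
$$

Then, the statistical mean of $\gamma$ is just

$$
\begin{aligned}
\langle \gamma \rangle &= \frac{1}{N}\sum_{j=1}^{N} \gamma_j \\
&= \frac{1}{N}\sum_{j=1}^{N} (\alpha_1 \beta_1 + \alpha_1 \beta_2 + \alpha_2 \beta_1 - \alpha_2 \beta_2) \\
&= C_{11} + C_{12} + C_{21} - C_{22},
\end{aligned}
\tag{2.4}
$$

where in the second line we have again suppressed j.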

The locality condition enters the discussion when we attempt to put bounds on $\langle \gamma \rangle$.
Recall that a knob will be set to either position 1 or 2 for each measurement. We assume
that a measurement on A is independent of a measurement on B. The $\alpha$'s and $\beta$'s are
thus treated independently. This is the locality condition.

At this point a digression is necessary. Eberhard states that only one setting of each
knob (position 1 or 2) will be used for each measurement, and that thus only one $\alpha$ or $\beta$ is
recorded for each event. However, if this were indeed the case, then for each measurement
only one term in $\gamma$ would survive (one product $\alpha\beta$) and the upper bound on $\gamma$ would be
1 (cf. Eqs. (2.3) and (2.7)). That the upper bound is 2 shows that mathematically all
four possible terms $\alpha\beta$ are present in $\gamma$. Consequently, not only are the $\alpha$'s being taken
to be independent of the $\beta$'s but $\alpha_1$ ($\beta_1$) is being treated as independent of $\alpha_2$ ($\beta_2$). The
rationale for including all $\alpha$'s and $\beta$'s in $\gamma$ simultaneously comes from a 1971 suggestion of
Stapp [10]. Stapp, Eberhard (and Peres [8] in his nearly identical thought experiment) are
actually considering all possible outcomes of the measurements in a hypothetical ensemble
space. By doing so they intend to show that the quantum predictions are incompatible
with any conceivable set of outcomes of the experiment.

One can take several attitudes toward such a procedure. A first possible attitude
is that it is illegitimate to speculate about the results of unperformed experiments. In
other words, if one takes the quantity $\gamma$ literally, the knobs must be set to two positions
at once, a physical impossibility. A second view is that it is indeed legitimate to think
about all possible outcomes of an experiment¹ and that if one does so, one is forced to the
conclusion that quantum mechanics is nonlocal. In fact, there is a third possible viewpoint.
As we discuss below, the $\gamma$'s are derivable from the "master probabilities" employed in a
standard derivation of Bell's inequalities, quantities that are not directly measurable but
nevertheless have physical consequences. Hence both the Eberhard procedure and the
standard derivation suffer from exactly the same ambiguities. For the moment it is not
important which philosophy one adopts; we merely treat $\gamma$ as a mathematical quantity,
as Eberhard does. At the same time, however, we see that by treating all the $\alpha$'s and
$\beta$'s as independent, mathematically the locality condition becomes indistinguishable from
the general assumption of independent variables.

In any case, following Eberhard we allow all 16 possible sign combinations
$(\alpha_1, \alpha_2, \beta_1, \beta_2)$ in each $\gamma$. At this
stage of the exposition, Eberhard goes through an elaborate argument to show that $\gamma \le 2$
always. However, let us redistribute the terms in Eq. (2.3) and write

$$
\gamma = \alpha_1 (\beta_1 + \beta_2) + \alpha_2 (\beta_1 - \beta_2). \tag{2.5}
$$

Because $\beta_1$ and $\beta_2$ are either equal or of opposite sign, if the first term is nonzero, the second
term is zero and vice versa. Thus we can see trivially that $\gamma = \pm 2$ always and $|\gamma| = 2$,
period.
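
The statement that $\gamma = \pm 2$ for every sign assignment can also be checked by brute force. The following is a minimal sketch; the function name `gamma` and the enumeration are ours, introduced only for illustration, and are not part of Eberhard's argument.

```python
from itertools import product


def gamma(a1, a2, b1, b2):
    """Per-event quantity of Eq. (2.3) for outcomes a1, a2, b1, b2 = +1 or -1."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2


# Enumerate every assignment of +1/-1 to (alpha_1, alpha_2, beta_1, beta_2).
values = [gamma(*signs) for signs in product((+1, -1), repeat=4)]

# Tally how often each value of gamma occurs.
counts = {v: values.count(v) for v in sorted(set(values))}
print(counts)  # {-2: 8, 2: 8}
```

Of the 16 combinations, eight give +2 and eight give -2, so no assignment of definite outcomes can push a single $\gamma$ beyond magnitude 2.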

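But by the triangle inequality we know that

$$
\left| \frac{1}{N}\sum_{j=1}^{N} (\alpha_1 \beta_1 + \alpha_1 \beta_2 + \alpha_2 \beta_1 - \alpha_2 \beta_2) \right|
\le \frac{1}{N}\sum_{j=1}^{N} \left| (\alpha_1 \beta_1 + \alpha_1 \beta_2 + \alpha_2 \beta_1 - \alpha_2 \beta_2) \right| . \tag{2.6}
$$

Yet from Eq. (2.4) and Eq. (2.3) this is by definition

$$
\begin{aligned}
|C_{11} + C_{12} + C_{21} - C_{22}| &\le \frac{1}{N}\sum_{j=1}^{N} |\gamma_j| \\
&= \frac{1}{N}\, N \times 2 .
\end{aligned}
\tag{2.7}
$$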

The CHSH inequality follows immediately:

$$
|C_{11} + C_{12} + C_{21} - C_{22}| \le 2, \tag{2.8}
$$

or, in more compact notation,

$$
|C| \le 2. \tag{2.9}
$$

¹ This concept is often referred to as "counterfactual definiteness," after Stapp.

Eberhard next considers a quantum-mechanical experiment in which two photons are
emitted in the directions of A and B by an atom between them. The photons are detected
by polarizers; each $\alpha$ ($\beta$) is taken to be +1 when one polarization is detected and -1
when the other is detected. Unfortunately, at this point the paper becomes very unclear.
Eberhard merely asserts without calculation that for each of the C's in Eq. (2.2), quantum
mechanics predicts that "if the number of events N is large enough, then C = cos(2a - 2b),"
where 2a - 2b is twice the angle between the polarizers. Actually, no approximation
is necessary. For spin-1/2 particles, the correct prediction is

$$
C_{qm} = 3\cos\theta - \cos 3\theta, \tag{2.10}
$$

which we derive below, and in which $\theta$ is the angle between polarizers. (The result for
photons will be the same if $\theta$ is taken to be twice the angle between polarizers.) Note
that for $\theta = 45°$, (2.10) gives $C_{qm} = 2\sqrt{2} > 2$. Therefore, quantum mechanics violates the
CHSH inequality, just as it does the Bell inequalities.

As mentioned above, the demonstration seems to assume almost nothing: no hidden
variables, merely "locality," which implies that a certain mathematical quantity $\gamma$ always
equals ±2. However, on closer inspection we find that more than an assumption of
independent $\alpha$'s and $\beta$'s is being made. In the first place, the value 2 on the right-hand side
of Eq. (2.8) is entirely arbitrary and results merely from the choice of ±1 as the
"eigenvalues" for $\alpha$ and $\beta$. One could have equally well chosen ±1000. In that case, however,
one would necessarily have to assume that the corresponding quantum experiment also
had eigenvalues of ±1000. This matter is not so serious, but it nevertheless illustrates
that the CHSH inequality is not a purely mathematical assertion; a real measurement
does lurk in the background.

The central issue lies elsewhere. Eberhard's version of the CHSH inequality is a statement
about the statistical mean of $\gamma$, and therefore it does deal with a probability distribution
over the $\gamma$. Moreover, the frequency with which a particular $\gamma$ occurs is clearly taken to be
positive. That probabilities should be positive-definite is usually regarded as self-evident,
but because the assumption is the crux of the matter, we spend a moment examining it.
(In the Appendix we detail where other authors have made the same assumption.)

As mentioned, there are 16 possible combinations of
$\alpha_1 \beta_1 + \alpha_1 \beta_2 + \alpha_2 \beta_1 - \alpha_2 \beta_2$ ($= \gamma$),
of which eight have the value +2 and eight have the value -2. In a sequence of N
measurements, let us suppose that +2 occurs $n_1$ times and -2 occurs $n_2$ times, such that
$n_1 + n_2 = N$. Then

$$
C = \frac{2}{N}\,[\,n_1 - n_2\,]. \tag{2.11}
$$

If all frequencies are equal, i.e. $n_1 = n_2$, then C = 0. If $n_2 = 0$, then C = 2 and if $n_1 = 0$
then C = -2. But here we have assumed that both $n_1$ and $n_2$ are positive-definite. If
$n_2 < 0$, then C > 2. In other words, the step leading to the second line in Eq. (2.7) is
valid only when |n| = n.
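
To make the role of the sign of $n_2$ concrete, Eq. (2.11) can be evaluated with relative weights $w = n/N$. The sketch below is illustrative only: the function name `mean_gamma` and the particular signed weights are our own choices, obtained by solving $w_+ - w_- = \sqrt{2}$ together with $w_+ + w_- = 1$.

```python
import math


def mean_gamma(w_plus, w_minus):
    """Eq. (2.11) written with relative weights w = n/N, so w_plus + w_minus = 1."""
    if abs(w_plus + w_minus - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    return 2.0 * (w_plus - w_minus)


# Ordinary (non-negative) frequencies stay within the bound |C| <= 2.
print(mean_gamma(0.5, 0.5))   # 0.0
print(mean_gamma(1.0, 0.0))   # 2.0

# Signed weights that reproduce the quantum value 2*sqrt(2).
w_plus = (1 + math.sqrt(2)) / 2
w_minus = (1 - math.sqrt(2)) / 2   # about -0.207, i.e. negative
print(w_minus, mean_gamma(w_plus, w_minus))
```

Reaching $2\sqrt{2}$ requires a weight of roughly -0.21 on the outcomes with $\gamma = -2$, which is exactly the kind of "frequency" that the step to the second line of Eq. (2.7) excludes.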

The notion of "extended" (non-positive-definite) probabilities has been considered by 
a surprising number of prominent investigators, but the majority of physicists continue 
to regard them with distaste, if not revulsion. Nevertheless, the quantum violation of 
the bound on C is effectively due to the fact that quantum mechanics allows negative
probabilities. In the next section we examine this claim in greater detail. 

3 Quantum Mechanical Probabilities 

Before deriving Eq. (2.10), it will be helpful to summarize the procedure for obtaining
the standard Bell inequalities in order to point out similarities to the CHSH-Eberhard
experiment. The reader is referred to SR or Sakurai for additional details; see also the
Appendix. Like its successor, Bell's theorem is valid for local hidden-variable theories,
which involve only classical probabilities. In a typical derivation such as Sakurai's one
assumes that spin measurements may be made along any of three axes, a, b and c. A
system of decaying atoms emits N particles of which a certain fraction are taken to be,
say, of the type (a+, b+, c+) = (+ + +), which designates spin up along all three axes.
To ensure zero total angular momentum, each emitted particle of type (+ + +) must be
paired with one of type (− − −). There are eight such spin combinations in all, as listed
in Table 1.

The probability that (+ + +) is emitted (and in the case of hidden variables, detected)
is defined simply as P(+ + +) = N(+ + +)/N. One can immediately object that such
a probability is unphysical because to determine it requires three simultaneous spin
measurements on a system of two particles, which is impossible. To eliminate this difficulty,
one forms pairwise probabilities of the type P(a+, b+) = P(++), which represents the
joint probability that the first particle will be found + along a and the second particle
+ along b. This is easily done. From the table, the total number of particles such that
the first particle's spin is + along a is N(+ − +) + N(+ − −), which must be paired
with N(− + −) + N(− + +), the total number of particles for which the second particle's
spin is + along b. This combination is labeled $N_3 + N_5$. Next one forms triangle-type
inequalities such as

$$
N_3 + N_5 \le (N_2 + N_5) + (N_3 + N_7), \tag{3.1}
$$

which is obviously true, since we have just added positive numbers to $N_3 + N_5$. Dividing
by N gives by definition

$$
P(a+, b+) \le P(a+, c+) + P(c+, b+), \tag{3.2}
$$

one of the Bell inequalities. Eq. (3.2) involves only one measurement on each particle
and so represents a physically realizable situation. Note that the "three-probabilities"
P(+ + +) were reduced to pairwise probabilities P(++) by summing over the spins on
the extraneous axis, in the above example c. We emphasize that, just as was the case
for the CHSH inequality, the Bell inequality is valid only if the N's and hence the P's
are taken to be positive-definite. In SR we demonstrated that one can form quantum
probabilities P(+ + +), analogous to the classical probabilities, then sum over the third
argument exactly as above to get pairwise quantum probabilities P(++) that violate (3.2)
in the usual way.

By this point the reader will have noticed a similarity between the $\gamma$'s in Eberhard's
experiment and the three-probabilities here. Authors who derive the generalized Bell
inequalities introduce $\gamma$ as a measure of correlations between real and imagined experiments
but, as mentioned, if one takes it literally it amounts to having the apparatus knobs set
on two positions simultaneously. This would seem to represent the same sort of physical
impossibility as that of making three simultaneous spin measurements on two particles.
Indeed, we will demonstrate in Section 4 that the two procedures are identical:
introducing an ensemble of hypothetical measurements is exactly equivalent to assuming a "master
probability distribution" that requires more than two simultaneous spin measurements on
two particles. Before doing so, however, we return to the Eberhard derivation.

Eberhard's experiment involves four axes, $\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$, rather than three, but
otherwise is almost identical to the standard derivation of Bell's inequalities, and so it is not
surprising that the above procedure can be followed to demonstrate a violation of the
CHSH inequality. We first need to compute the quantum pairwise probabilities of the
type just mentioned, P(a+, b+). There are several ways to do this. Following SR, we
write the quantum-mechanical projection operator for spin-1/2 particles as

$$
\Pi(\mathbf{a}\pm) = \frac{1}{2}\,(1 \pm \boldsymbol{\sigma}\cdot\mathbf{a}). \tag{3.3}
$$

In this equation we are representing the Pauli spin matrices as a vector,
$\boldsymbol{\sigma} = \hat{\imath}\,\sigma_x + \hat{\jmath}\,\sigma_y + \hat{k}\,\sigma_z$.
Thus $\boldsymbol{\sigma}\cdot\mathbf{a} = \sigma_x a_x + \sigma_y a_y + \sigma_z a_z$ represents a traceless 2×2 matrix and 1 is the unit
matrix. Now, the expectation value of any operator O can be written $\langle O \rangle = \mathrm{Tr}(\rho O)$,
where $\rho$ is the density matrix, $\rho = \mathrm{diag}(1/2, 1/2)$ for an initially unpolarized beam. The
probability of finding the first particle in the + state along a is thus $\mathrm{Tr}(\rho\,\Pi(\mathbf{a})) = 1/2$.
Similarly, the joint probability P(a+, b±) of finding the first particle in the + state along
a and the second particle in the ± state along b is

$$
\begin{aligned}
P(\mathbf{a}+, \mathbf{b}\pm) &= \frac{1}{2}\,\mathrm{Tr}\,\Pi(\mathbf{a})\,\Pi(\mathbf{b}\pm) \\
&= \frac{1}{8}\,\mathrm{Tr}\{(1 + \boldsymbol{\sigma}\cdot\mathbf{a})(1 \pm \boldsymbol{\sigma}\cdot\mathbf{b})\} \\
&= \frac{1}{4}\,(1 \pm \mathbf{a}\cdot\mathbf{b}).
\end{aligned}
\tag{3.4}
$$
Here, use has been made of the standard identity (see @) 

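$$
(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = (\mathbf{a}\cdot\mathbf{b})\,1 + i\,\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b}). \tag{3.5}
$$

Because the Pauli matrices are traceless, taking the trace of (3.5) yields $2\,\mathbf{a}\cdot\mathbf{b}$.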
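
Equation (3.4) can be verified numerically by multiplying the projectors explicitly. The sketch below uses plain 2×2 complex matrices; the helper names (`projector`, `matmul`, `pair_probability`) and the test angles are ours and serve only as an illustration.

```python
import math

# Pauli matrices as nested tuples (rows of entries).
SIGMA = (
    ((0, 1), (1, 0)),       # sigma_x
    ((0, -1j), (1j, 0)),    # sigma_y
    ((1, 0), (0, -1)),      # sigma_z
)


def projector(axis, sign):
    """Pi(axis, sign) = (1 + sign * sigma.axis) / 2, as in Eq. (3.3)."""
    return [
        [0.5 * ((i == j) + sign * sum(n * s[i][j] for n, s in zip(axis, SIGMA)))
         for j in range(2)]
        for i in range(2)
    ]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def pair_probability(a, b, sign):
    """(1/2) Tr[Pi(a+) Pi(b, sign)], the first line of Eq. (3.4)."""
    M = matmul(projector(a, +1), projector(b, sign))
    return 0.5 * (M[0][0] + M[1][1])


theta = 1.0  # angle between the two axes in radians (arbitrary test value)
a = (0.0, 0.0, 1.0)
b = (math.sin(theta) * math.cos(0.3), math.sin(theta) * math.sin(0.3), math.cos(theta))

for sign in (+1, -1):
    numeric = pair_probability(a, b, sign)
    closed = 0.25 * (1 + sign * math.cos(theta))   # last line of Eq. (3.4)
    print(sign, numeric, closed)
```

The imaginary part contributed by the cross-product term in (3.5) drops out of the trace, so the numerical value agrees with $\frac{1}{4}(1 \pm \mathbf{a}\cdot\mathbf{b})$.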

Equation (3.4) is simply a sophisticated way of writing Malus' law. The first factor
of 1/2 in (3.4) gives the probability of detecting a particle in the + state along the a
axis. The remaining factor is $\frac{1}{2}(1 + \mathbf{a}\cdot\mathbf{b}) = \frac{1}{2}(1 + \cos\theta)$, where $\theta$ is the angle between
polarizers. For photons (where $\theta$ is taken to be the double angle) this then represents
the usual cos² dependence of the intensity on the polarizer angle, since
$\frac{1}{2}(1 + \cos\theta) = \cos^2(\theta/2)$. For a Bohm-type experiment, which assumes
an (antisymmetric) spin-singlet state, one should choose the minus sign on the right of (3.4) when
computing P(a+, b+) to conserve angular momentum. With either sign, by inserting
(3.4) into (3.2), it is straightforward to show that quantum mechanics violates Bell's
inequalities.
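
As a concrete instance of this violation, one can insert the singlet form of (3.4) into (3.2) for one particular geometry. The sketch below assumes, as our own choice and not something fixed by the text, that a, c and b are coplanar with c bisecting the angle between a and b; the function names are illustrative.

```python
import math


def p_singlet(angle):
    """Eq. (3.4) with the minus sign: P(+, +) for two axes separated by `angle` (radians)."""
    return 0.25 * (1.0 - math.cos(angle))


def bell_slack(theta):
    """Right-hand side minus left-hand side of Eq. (3.2).

    Assumed geometry (ours): a, c, b coplanar, with c bisecting the angle
    2*theta between a and b. A negative value means (3.2) is violated.
    """
    return p_singlet(theta) + p_singlet(theta) - p_singlet(2.0 * theta)


for degrees in (30, 60, 90, 120):
    print(degrees, round(bell_slack(math.radians(degrees)), 4))
```

For this layout the slack is negative whenever the half-angle lies between 0° and 90°, so the quantum pairwise probabilities cannot all come from non-negative populations N.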

For the Eberhard experiment we take the knob settings $\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$ to represent the
position of the polarizers on the measuring devices. Recall that his quantities $C = \langle \alpha\beta \rangle$
represented the fraction of events in which $\alpha$ and $\beta$ had the same sign minus the fraction
in which they had opposite signs, irrespective of whether an individual spin is + or −.
Evidently the equivalent quantum expression is
$\frac{1}{2}(1 + \mathbf{a}\cdot\mathbf{b}) - \frac{1}{2}(1 - \mathbf{a}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b}$. Then

$$
C_{qm} = \mathbf{a}_1\cdot\mathbf{b}_1 + \mathbf{a}_1\cdot\mathbf{b}_2 + \mathbf{a}_2\cdot\mathbf{b}_1 - \mathbf{a}_2\cdot\mathbf{b}_2 . \tag{3.6}
$$

If the axes are chosen to be coplanar such that
$\mathbf{a}_1\cdot\mathbf{b}_1 = \mathbf{a}_1\cdot\mathbf{b}_2 = \mathbf{a}_2\cdot\mathbf{b}_1 = \cos\theta$ and
$\mathbf{a}_2\cdot\mathbf{b}_2 = \cos 3\theta$, then (3.6) gives exactly (2.10), which violates the CHSH inequality for
$\theta = 45°$.
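
A short numerical sketch of this step follows. The specific layout of the four axes (placing $\mathbf{a}_2$, $\mathbf{b}_1$, $\mathbf{a}_1$, $\mathbf{b}_2$ at angles 0, $\theta$, $2\theta$, $3\theta$ in a plane) is one choice of ours that satisfies the stated dot products; the helper names are illustrative.

```python
import math


def unit(angle):
    """Unit vector in the x-y plane at `angle` radians from the x axis."""
    return (math.cos(angle), math.sin(angle), 0.0)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def c_qm(a1, a2, b1, b2):
    """Eq. (3.6)."""
    return dot(a1, b1) + dot(a1, b2) + dot(a2, b1) - dot(a2, b2)


def coplanar_axes(theta):
    """One layout with a1.b1 = a1.b2 = a2.b1 = cos(theta) and a2.b2 = cos(3*theta)."""
    a2, b1, a1, b2 = (unit(k * theta) for k in range(4))
    return a1, a2, b1, b2


theta = math.radians(45)
value = c_qm(*coplanar_axes(theta))
# Compare Eq. (3.6), Eq. (2.10) and 2*sqrt(2) at theta = 45 degrees.
print(value, 3 * math.cos(theta) - math.cos(3 * theta), 2 * math.sqrt(2))
```

All three printed numbers agree to rounding error, confirming that (3.6) reduces to (2.10) for this geometry and exceeds the CHSH bound of 2.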

The derivation of (2.10) just given involved only pairwise probabilities and did not go
beyond standard quantum mechanics. With the projection-operator formalism, however,
it is not difficult to write down the joint probability for four "simultaneous" spin
measurements among four axes. An example would be P(+ + + +), in analogy to the classical
three-probability mentioned earlier that appears in the derivation of Bell's inequality.
Extending (3.4) to four arguments we take

$$
P(\lambda\mathbf{a}_1, \mu\mathbf{a}_2, \nu\mathbf{b}_1, \tau\mathbf{b}_2) = \frac{1}{2}\,\mathrm{Tr}\{\Pi(\lambda\mathbf{a}_1)\,\Pi(\mu\mathbf{a}_2)\,\Pi(\nu\mathbf{b}_1)\,\Pi(\tau\mathbf{b}_2)\}, \tag{3.7}
$$

where $\lambda, \mu, \nu, \tau$ are chosen as ±1 to represent up or down. For the symmetric case this is

$$
P(\lambda\mathbf{a}_1, \mu\mathbf{a}_2, \nu\mathbf{b}_1, \tau\mathbf{b}_2) = \frac{1}{32}\,\mathrm{Tr}\{(1 + \lambda\boldsymbol{\sigma}\cdot\mathbf{a}_1)(1 + \mu\boldsymbol{\sigma}\cdot\mathbf{a}_2)(1 + \nu\boldsymbol{\sigma}\cdot\mathbf{b}_1)(1 + \tau\boldsymbol{\sigma}\cdot\mathbf{b}_2)\}. \tag{3.8}
$$

We will need the antisymmetric expression later to make the subtraction just done above.
Assuming that a measurement of + on knob a requires − on knob b, the antisymmetric
case will be the same expression as (3.8) with the signs on the b's reversed. We calculate
only the symmetric case and state the results for the antisymmetric case as needed.
Working out (3.8) and making frequent use of the identity (3.5) yields



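$$
\begin{aligned}
P(\lambda\mathbf{a}_1, \mu\mathbf{a}_2, \nu\mathbf{b}_1, \tau\mathbf{b}_2) = \frac{1}{16}\{ & 1 + \lambda\mu\,\mathbf{a}_1\cdot\mathbf{a}_2 + \lambda\nu\,\mathbf{a}_1\cdot\mathbf{b}_1 + \lambda\tau\,\mathbf{a}_1\cdot\mathbf{b}_2 \\
& + \mu\nu\,\mathbf{a}_2\cdot\mathbf{b}_1 + \mu\tau\,\mathbf{a}_2\cdot\mathbf{b}_2 + \nu\tau\,\mathbf{b}_1\cdot\mathbf{b}_2 \\
& + i\lambda\mu\nu\,(\mathbf{a}_1\times\mathbf{a}_2)\cdot\mathbf{b}_1 + i\lambda\mu\tau\,(\mathbf{a}_1\times\mathbf{a}_2)\cdot\mathbf{b}_2 \\
& + i\lambda\nu\tau\,(\mathbf{b}_1\times\mathbf{b}_2)\cdot\mathbf{a}_1 + i\mu\nu\tau\,(\mathbf{b}_1\times\mathbf{b}_2)\cdot\mathbf{a}_2 \\
& + \lambda\mu\nu\tau\,[(\mathbf{a}_1\cdot\mathbf{a}_2)(\mathbf{b}_1\cdot\mathbf{b}_2) - (\mathbf{a}_1\times\mathbf{a}_2)\cdot(\mathbf{b}_1\times\mathbf{b}_2)] \}.
\end{aligned}
\tag{3.9}
$$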

Notice that this expression is complex due to the imaginary elements of $\sigma_y$. If we desire a
real result to eventually make contact with the usual quantum predictions, we can easily
eliminate the imaginary terms. Note that
$\Pi(\lambda\mathbf{a}_1)\,\Pi(\mu\mathbf{a}_2)\,\Pi(\nu\mathbf{b}_1)\,\Pi(\tau\mathbf{b}_2)$ has been written
in an arbitrary order; it is not symmetric in the arguments. There are 4! permutations
of the arguments in this expression, twelve even and twelve odd. In (3.9) each imaginary
term is a triple scalar product, which is invariant under even permutations and changes
sign under odd permutations. Thus these terms vanish under symmetrization, as does
the double cross product in the last line. The symmetrized version of (3.9) is

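$$
\begin{aligned}
P(\lambda\mathbf{a}_1, \mu\mathbf{a}_2, \nu\mathbf{b}_1, \tau\mathbf{b}_2) = \frac{1}{16}\{ & 1 + \lambda\mu\,\mathbf{a}_1\cdot\mathbf{a}_2 + \lambda\nu\,\mathbf{a}_1\cdot\mathbf{b}_1 + \lambda\tau\,\mathbf{a}_1\cdot\mathbf{b}_2 \\
& + \mu\nu\,\mathbf{a}_2\cdot\mathbf{b}_1 + \mu\tau\,\mathbf{a}_2\cdot\mathbf{b}_2 + \nu\tau\,\mathbf{b}_1\cdot\mathbf{b}_2 \\
& + \frac{1}{3}\,\lambda\mu\nu\tau\,[(\mathbf{a}_1\cdot\mathbf{a}_2)(\mathbf{b}_1\cdot\mathbf{b}_2) + (\mathbf{a}_1\cdot\mathbf{b}_1)(\mathbf{a}_2\cdot\mathbf{b}_2) + (\mathbf{a}_1\cdot\mathbf{b}_2)(\mathbf{b}_1\cdot\mathbf{a}_2)] \},
\end{aligned}
\tag{3.10}
$$

which is entirely real.²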
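
The symmetrization can be checked by brute force: average the ordered trace of Eq. (3.7) over all 24 orderings of the four projectors and compare with the closed form (3.10). The following sketch does this for four arbitrary axes. The helper names and the test vectors are ours, chosen only for illustration.

```python
import itertools
import math

SIGMA = (
    ((0, 1), (1, 0)),       # sigma_x
    ((0, -1j), (1j, 0)),    # sigma_y
    ((1, 0), (0, -1)),      # sigma_z
)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def projector(sign, axis):
    """(1 + sign * sigma.axis) / 2, as in Eq. (3.3)."""
    return [[0.5 * ((i == j) + sign * sum(n * s[i][j] for n, s in zip(axis, SIGMA)))
             for j in range(2)] for i in range(2)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ordered_four_prob(projectors):
    """(1/2) Tr of the projectors multiplied in the given order, Eq. (3.7)."""
    M = [[1, 0], [0, 1]]
    for P in projectors:
        M = matmul(M, P)
    return 0.5 * (M[0][0] + M[1][1])


def symmetrized_four_prob(signs, axes):
    """Average of Eq. (3.7) over all 4! orderings of the four projectors."""
    projectors = [projector(s, n) for s, n in zip(signs, axes)]
    orderings = list(itertools.permutations(projectors))
    return sum(ordered_four_prob(p) for p in orderings) / len(orderings)


def closed_form(signs, axes):
    """Eq. (3.10)."""
    lam, mu, nu, tau = signs
    a1, a2, b1, b2 = axes
    pairs = (lam * mu * dot(a1, a2) + lam * nu * dot(a1, b1) + lam * tau * dot(a1, b2)
             + mu * nu * dot(a2, b1) + mu * tau * dot(a2, b2) + nu * tau * dot(b1, b2))
    quartic = (dot(a1, a2) * dot(b1, b2) + dot(a1, b1) * dot(a2, b2)
               + dot(a1, b2) * dot(b1, a2))
    return (1 + pairs + lam * mu * nu * tau * quartic / 3) / 16


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)


# Four arbitrary, non-coplanar unit vectors (test values, not from the paper).
axes = [normalize(v) for v in ((1, 0, 0), (1, 2, 0), (0, 1, 3), (2, -1, 1))]

max_diff = max(
    abs(symmetrized_four_prob(signs, axes) - closed_form(signs, axes))
    for signs in itertools.product((+1, -1), repeat=4)
)
print(max_diff)  # expected to be at rounding-error level
```

Because `abs` is taken of a complex difference, a vanishing `max_diff` confirms both that the imaginary parts cancel and that the real part matches (3.10), including the factor of 1/3.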

It is now easy to read off the various four-probabilities, P(+ + + +), P(− − − −), etc.,
for each case merely by choosing the required signs of $\lambda, \mu, \nu, \tau$. The sixteen possibilities
are listed for convenience in Table II. Note that these four-probabilities do sum to one
and therefore in that respect behave as ordinary probabilities. However, although it is
perhaps not evident from inspection, several of these probabilities can become negative.

We plot P(+ + H — ) and P(H 1 — ) in Figure 1. The antisymmetric P's can be obtained 

from the symmetric ones merely by flipping the signs on the two b's.
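
Both properties, normalization and negativity, are easy to confirm numerically from the symmetrized expression (3.10). The sketch below evaluates all sixteen four-probabilities for one coplanar layout at $\theta = 45°$. The layout and helper names are our own illustrative choices, and the values are computed directly from (3.10) rather than taken from Table II.

```python
import itertools
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def unit(angle):
    """Unit vector in the x-y plane at `angle` radians from the x axis."""
    return (math.cos(angle), math.sin(angle), 0.0)


def four_prob(signs, axes):
    """Symmetrized four-probability, Eq. (3.10)."""
    lam, mu, nu, tau = signs
    a1, a2, b1, b2 = axes
    pairs = (lam * mu * dot(a1, a2) + lam * nu * dot(a1, b1) + lam * tau * dot(a1, b2)
             + mu * nu * dot(a2, b1) + mu * tau * dot(a2, b2) + nu * tau * dot(b1, b2))
    quartic = (dot(a1, a2) * dot(b1, b2) + dot(a1, b1) * dot(a2, b2)
               + dot(a1, b2) * dot(b1, a2))
    return (1 + pairs + lam * mu * nu * tau * quartic / 3) / 16


theta = math.radians(45)
# Coplanar layout (our choice): a2, b1, a1, b2 at angles 0, theta, 2*theta, 3*theta.
axes = (unit(2 * theta), unit(0.0), unit(theta), unit(3 * theta))

table = {signs: four_prob(signs, axes)
         for signs in itertools.product((+1, -1), repeat=4)}

total = sum(table.values())
negative = [signs for signs, p in table.items() if p < 0]
print(total)                 # sums to one up to rounding
print(len(negative))         # 8 for this geometry
print(min(table.values()))   # equals (1 - sqrt(2)) / 16, about -0.026
```

For this geometry half of the sixteen entries are negative, each equal to $(1 - \sqrt{2})/16$, while the total still sums to one.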

From these four-probabilities one can form the quantity $C_{qm}$ in Eq. (3.6) in exact
analogy to the procedure used for deriving the Bell inequalities. To compute $P(\mathbf{a}_1+, \mathbf{b}_1+)$,
for example, we only care that the first particle will be found + along $\mathbf{a}_1$ and the second
particle will be found + along $\mathbf{b}_1$. As before, we count all such possibilities by summing
over the two extraneous arguments, $\mathbf{a}_2$ and $\mathbf{b}_2$. Thus, for the symmetric wavefunction,

$$
P(\mathbf{a}_1+, \mathbf{b}_1+) = P(+\,\cdot\,+\,\cdot) = P(++++) + P(+++-) + P(+-+-) + P(+-++). \tag{3.11}
$$

Reading off these P's from Table II and performing the sum yields

$$
\frac{1}{4}\,(1 + \mathbf{a}_1\cdot\mathbf{b}_1), \tag{3.12}
$$

which is exactly Eq. (3.4). For the antisymmetric wave function one obtains
$\frac{1}{4}(1 - \mathbf{a}_1\cdot\mathbf{b}_1)$. Similar expressions are obtained for the other three pairwise probabilities.
Clearly, subtracting the antisymmetric expressions from the symmetric ones and adding
the four terms leads back to Eq. (3.6) for $C_{qm}$. This procedure must work because the
four-probabilities are symmetric in all the arguments; summing over any of them produces
an equal number of terms of opposite sign, which cancel out, leaving the usual quantum
pairwise probabilities.



4 Discussion and Conclusions 

We have shown that, like the Bell inequalities, the CHSH inequality assumes
positive-definite probabilities and that quantum mechanics breaks both inequalities effectively
because it introduces negative weights to the measurements. These negative four-probabilities
enter the derivation in exactly the same way as the classical three-probabilities entered the
derivation of Bell's inequalities. If they are unphysical, it is not necessarily because
they are negative, but because it is impossible to make four simultaneous spin
measurements on two particles. By the same token, it is impossible to make three simultaneous
spin measurements on two particles. In any case, neither the classical three-probabilities
found in Bell's theorem, nor the four-probabilities that figure here are actually measured.
Both merely serve as "master distributions" from which to derive the usual pairwise
probabilities, classical and quantum, which are both positive-definite. To reiterate our earlier
remarks, from this point of view it is not surprising that the Bell and CHSH inequalities
are violated by experimental tests; their derivations merely use the wrong set of
probabilities for a quantum-mechanical problem.

² It is not actually necessary to symmetrize (3.9). One can leave it as a complex expression, but when
the sum over the extraneous arguments is performed as in (3.11), the imaginary terms cancel and the
result will be entirely real, as before. However, the complex four-probability is not symmetric in the
arguments.

Although one might choose to reject negative probabilities as unphysical, one should
not reject the notion of master probability distributions in favor of correlations between
real and imagined experiments, because the two procedures are identical! Recall again
that Eberhard's quantity $C_{11}$ was $C_{11} = \frac{1}{N}\sum_{j=1}^{N} \alpha_{1j}\beta_{1j}$, which represented the fraction of
events $\alpha_1\beta_1$ that had the same sign minus the fraction that had opposite sign. Thus by
definition we can write

$$
C_{11} = P(\mathbf{a}_1+, \mathbf{b}_1+) + P(\mathbf{a}_1-, \mathbf{b}_1-) - [P(\mathbf{a}_1+, \mathbf{b}_1-) + P(\mathbf{a}_1-, \mathbf{b}_1+)]. \tag{4.1}
$$

Now, in exact analogy with the procedure of Section 3 we imagine that these pairwise
probabilities can be derived from a master distribution involving all four axes
$\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$.
In that case, as in Eq. (3.11), $P(++) = P(\mathbf{a}_1+, \mathbf{b}_1+) = P(++++) + P(+++-) + P(+-+-) + P(+-++)$,
with analogous expressions for P(− −), P(+ −) and P(− +). There
are thus 16 terms that contribute to $C_{11}$, and similarly for $C_{12}$, $C_{21}$ and $C_{22}$. Writing out all
64 terms yields for $C = \langle \gamma \rangle$:

$$
\begin{aligned}
C = 2\{ & P(++++) + P(----) + P(+++-) + P(---+) \\
& + P(+-++) + P(-+--) + P(+--+) + P(-++-) \\
& - P(++-+) - P(--+-) - P(-+++) - P(+---) \\
& - P(++--) - P(--++) - P(+-+-) - P(-+-+) \}.
\end{aligned}
\tag{4.2}
$$

These P's are general and may be taken to be either classical or quantum. Notice that half
enter with positive sign and half with negative. If all the probabilities are equal, then
C = 0. If those that enter with negative sign are zero, then C = 2 and if those that
enter with positive sign are zero, then C = -2. All this is in complete agreement with
the analysis of Section 2. Clearly, if the P's are positive-definite then $|C| \le 2$, but if the
probabilities are allowed to become negative then this bound can be violated. If the P's are
assumed to be quantum, they take on the values given by Table II. In this case, inserting
those values into (4.2) gives exactly (3.6), as before.
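
Equation (4.2) is simply the average of $\gamma$ weighted by the master distribution, so it can be evaluated for any set of sixteen weights. The sketch below does so for a uniform classical distribution and for the quantum four-probabilities of Eq. (3.10). The coplanar layout at $\theta = 45°$ and all helper names are our own illustrative assumptions.

```python
import itertools
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def unit(angle):
    return (math.cos(angle), math.sin(angle), 0.0)


def gamma(lam, mu, nu, tau):
    """Eq. (2.3), with signs ordered as (a1, a2, b1, b2)."""
    return lam * nu + lam * tau + mu * nu - mu * tau


def four_prob(signs, axes):
    """Symmetrized quantum four-probability, Eq. (3.10)."""
    lam, mu, nu, tau = signs
    a1, a2, b1, b2 = axes
    pairs = (lam * mu * dot(a1, a2) + lam * nu * dot(a1, b1) + lam * tau * dot(a1, b2)
             + mu * nu * dot(a2, b1) + mu * tau * dot(a2, b2) + nu * tau * dot(b1, b2))
    quartic = (dot(a1, a2) * dot(b1, b2) + dot(a1, b1) * dot(a2, b2)
               + dot(a1, b2) * dot(b1, a2))
    return (1 + pairs + lam * mu * nu * tau * quartic / 3) / 16


def chsh_from_master(table):
    """Eq. (4.2): every P enters with weight gamma = +2 or -2."""
    return sum(p * gamma(*signs) for signs, p in table.items())


ALL_SIGNS = list(itertools.product((+1, -1), repeat=4))

# Classical example: a uniform, non-negative master distribution.
uniform = {signs: 1 / 16 for signs in ALL_SIGNS}

# Quantum example: coplanar axes (our layout) with theta = 45 degrees.
theta = math.radians(45)
axes = (unit(2 * theta), unit(0.0), unit(theta), unit(3 * theta))
a1, a2, b1, b2 = axes
quantum = {signs: four_prob(signs, axes) for signs in ALL_SIGNS}

c_master = chsh_from_master(quantum)
c_pairwise = dot(a1, b1) + dot(a1, b2) + dot(a2, b1) - dot(a2, b2)   # Eq. (3.6)
print(chsh_from_master(uniform), c_master, c_pairwise)
```

The uniform distribution gives C = 0, while the quantum master distribution reproduces the pairwise result of Eq. (3.6), here $2\sqrt{2}$, precisely because some of its entries are negative.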



This demonstration shows clearly that the $\gamma$'s can be derived from a master probability
distribution which involves simultaneous spin measurements along four axes. The only 
difference between the classical and quantum cases is that in the former we assume the 
probabilities are positive-definite. The master distributions themselves cannot be regarded 
as any more or less meaningful than the space of hypothetical measurements, because the 
procedures are exactly equivalent. Indeed, we see that there is no difference between the 
Eberhard procedure and the usual derivation of Bell's inequalities. 

There remains the problem of interpretation. Most people insist that probability be 
defined in terms of relative frequency of events, in which case it must be positive-definite. 
In quantum mechanics, however, although one can define the expectation value in terms 
of the square of the wave amplitude, which corresponds to a relative-frequency interpre- 
tation, an alternate procedure is available. The expectation value may also be taken as a 
functional of the dynamical variables under consideration, for example position and mo- 
mentum. Classically, one might consider a Maxwellian distribution of particles in phase 
space; integrating over position or momentum would give the marginal probability distri- 
bution for the conjugate variable. But in quantum mechanics, the uncertainty principle 
precludes precise simultaneous knowledge of noncommuting variables. If one attempts
to associate a functional with a distribution over noncommuting variables, such that an
integration over one of them gives the correct marginal distribution for the other, then
one finds that the distribution function must in places become negative. This is the
well-known Wigner distribution [11].



In the case of spin, the different components of angular momentum do not commute;
hence no ordinary (positive-definite) probability distribution can be defined over the
various components simultaneously. Any distribution will share with the Wigner distribution
the property that it will become negative in some region of "phase space." For example,
in the spin-1/2 systems we have been considering, the probability of finding $S_z$ in the +
state and $S_x$ in the + state is given by taking the trace of the product of the projection
operators, as we have done earlier. Now, given a state with $S_x = +$, the probability is
1/2 for finding $S_z = +$, and 1/2 for $S_z = -$. Suppose, however, that many measurements
show $S_z = +$, always, but that $S_x = +$ appears with probability $\lambda$ and $S_x = -$ appears
with probability $1 - \lambda$ ($0 < \lambda < 1$). The probability for finding $S_z = -$ must then
be $(1/2)\lambda + (1/2)(1 - \lambda) = 1/2$. On the one hand, since $S_z = +$ is always found,
the probability of $S_z = -$ must equal zero. On the other hand, no mixture of $S_x = +$ and
$S_x = -$ can give a zero probability for $S_z = -$.
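
The arithmetic in the preceding paragraph can be checked with a short Python sketch. It builds the spin-1/2 projection operators from the Pauli matrices and evaluates the probability of the minus outcome along z as a trace. The helper names and the variable `lam` (the mixing probability) are hypothetical and introduced only for this illustration.

```python
# Pauli matrices for the x and z spin components (both are real 2x2 matrices).
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]


def projector(sigma, sign):
    """Projector (1 + sign * sigma) / 2 onto the eigenstate with eigenvalue sign."""
    return [[0.5 * ((1.0 if i == j else 0.0) + sign * sigma[i][j])
             for j in range(2)] for i in range(2)]


def trace_product(a, b):
    """Trace of the matrix product a @ b for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def mixture(lam):
    """Ordinary mixture: S_x = + with weight lam, S_x = - with weight 1 - lam."""
    up, down = projector(SIGMA_X, +1), projector(SIGMA_X, -1)
    return [[lam * up[i][j] + (1.0 - lam) * down[i][j]
             for j in range(2)] for i in range(2)]


def prob_sz_minus(lam):
    """Probability of finding S_z = - in the mixture."""
    return trace_product(mixture(lam), projector(SIGMA_Z, -1))


if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.9):
        print(lam, prob_sz_minus(lam))
```

For every weight between 0 and 1 the result is 1/2, so no positive mixture of the two x states can reproduce a zero probability for the minus outcome along z.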

This is quite a general property of noncommuting variables and has little to do with 
quantum mechanics. In such situations the best that one can ask for is that the probability 
distribution give the correct marginal distribution for one of the variables, in our case one 
component of angular momentum. This is what has been found in the present paper. The
probability distributions for simultaneous measurements along three or more axes are not
positive-definite, but the marginal distributions that give correlations between two spin
components are, and are in accord with the standard predictions of quantum mechanics.

The main point of this paper has been that assumptions beyond locality do enter into 
derivations of Bell's inequalities. It is worth mentioning yet another tacit assumption: 
that space is flat. The notion of parallel and antiparallel spins is only well defined for flat 
space where the measurement axes (the "z" axes) can be taken to be everywhere fixed 
relative to one another. In curved space there is no universal definition of parallel and 
one can only compare spins in distant locations by parallel transporting the measurement 
axes [12]. In the case of nonnegligible gravitational fields, then, the "nonlocal" EPR
correlations between two particles, to the extent that they can be said to exist at all, must
be the result of parallel transport, a local phenomenon.

Returning to probabilities, we find ourselves in a strange situation. If we insist that
probabilities remain positive-definite, we are forced to use vague and imprecise concepts,
such as "local" or "nonlocal," to describe the outcome of the EPR experiment. On the
other hand, we are able to formulate the precise mathematical conditions necessary for the
violation of the Bell and CHSH inequalities, although at the cost of introducing negative
probabilities. Most investigators would say that a unified, physical interpretation of
negative probabilities is, in fact, exactly what is currently lacking. To be sure, Feynman
conceded (see [5] and [13]; also [14], [15]) that all the results of quantum mechanics can be
analyzed in terms of negative probabilities, but he remained skeptical about the utility of
such an approach and about whether a useful meaning could be attached to it. Nevertheless,
many of the interpretational problems associated with negative probabilities stem from an
insistence on viewing them within the framework of relative frequencies. This is clearly
"no go." We have shown that a more natural framework for their interpretation arises when
one considers the expectation value as a measure of probability over noncommuting variables.
One can even go further than we have and consider complex probability measures [16],
which also involve expectation values. Under such circumstances it is well to bear
in mind that imaginary numbers are more similar to rotations than to real numbers. One
should also bear in mind that the very word "imaginary" is an obsolete relic of their
original status.
Appendix

Many researchers appear unwilling to accept that any assumptions beyond locality are 
employed in the derivations of Bell's inequalities. We now list a few of the proofs we have 
found and point out explicitly where the assumption of positive probabilities enters. 

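Bell 64. In Bell's original proof [1] he defines two quantities $A(\mathbf{a}, \lambda) = \pm 1$,
$B(\mathbf{b}, \lambda) = \pm 1$. He defines a normalized probability distribution $\rho(\lambda)$,
such that $\int d\lambda\, \rho(\lambda) = 1$. The expectation value of the product of the spin
components $\boldsymbol{\sigma}_1 \cdot \mathbf{a}$ and $\boldsymbol{\sigma}_2 \cdot \mathbf{b}$ is

$$P(\mathbf{a}, \mathbf{b}) = \int d\lambda\, \rho(\lambda) A(\mathbf{a}, \lambda) B(\mathbf{b}, \lambda), \tag{A.1}$$

which he shows can be written (his equation 14) as

$$P(\mathbf{a}, \mathbf{b}) = -\int d\lambda\, \rho(\lambda) A(\mathbf{a}, \lambda) A(\mathbf{b}, \lambda). \tag{A.2}$$

When another vector $\mathbf{c}$ is involved, one has

$$P(\mathbf{a}, \mathbf{b}) - P(\mathbf{a}, \mathbf{c}) = -\int d\lambda\, \rho(\lambda) \left[ A(\mathbf{a}, \lambda) A(\mathbf{b}, \lambda) - A(\mathbf{a}, \lambda) A(\mathbf{c}, \lambda) \right]. \tag{A.3}$$

Bearing in mind that $A(\mathbf{b}, \lambda) = 1/A(\mathbf{b}, \lambda)$ one can rewrite this as

$$P(\mathbf{a}, \mathbf{b}) - P(\mathbf{a}, \mathbf{c}) = \int d\lambda\, \rho(\lambda) A(\mathbf{a}, \lambda) A(\mathbf{b}, \lambda) \left[ A(\mathbf{b}, \lambda) A(\mathbf{c}, \lambda) - 1 \right]. \tag{A.4}$$

Bell then asserts

$$|P(\mathbf{a}, \mathbf{b}) - P(\mathbf{a}, \mathbf{c})| \le \int d\lambda\, \rho(\lambda) \left[ 1 - A(\mathbf{b}, \lambda) A(\mathbf{c}, \lambda) \right], \tag{A.5}$$

where, of course, $|A(\mathbf{a}, \lambda) A(\mathbf{b}, \lambda)| = 1$ and
$|A(\mathbf{b}, \lambda) A(\mathbf{c}, \lambda) - 1| = 1 - A(\mathbf{b}, \lambda) A(\mathbf{c}, \lambda)$.
However, strictly speaking the triangle inequality gives

$$|P(\mathbf{a}, \mathbf{b}) - P(\mathbf{a}, \mathbf{c})| \le \int d\lambda\, |\rho(\lambda)| \left[ 1 - A(\mathbf{b}, \lambda) A(\mathbf{c}, \lambda) \right], \tag{A.6}$$

which is equal to (A.5) only when $|\rho| = \rho$, i.e., when $\rho \ge 0$.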
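
The role of the sign of the weights can be illustrated with a small Python sketch. It replaces the integral over the hidden variable by a sum over the eight sign assignments of (A(a), A(b), A(c)), which is an assumption made here to keep the example finite. The particular signed weights, like the function names, are hypothetical and chosen only so that they sum to one and keep every correlation between -1 and 1.

```python
import itertools
import random

# Each "hidden state" fixes the three outcomes (A(a), A(b), A(c)) = (+-1, +-1, +-1).
OUTCOMES = list(itertools.product((+1, -1), repeat=3))


def correlation(weights, i, j):
    """Discrete analogue of (A.2): P(x, y) = -sum_k w_k A_k(x) A_k(y)."""
    return -sum(w * s[i] * s[j] for w, s in zip(weights, OUTCOMES))


def bell_sides(weights):
    """Return (|P(a,b) - P(a,c)|, bound with w, bound with |w|)."""
    lhs = abs(correlation(weights, 0, 1) - correlation(weights, 0, 2))
    rhs = sum(w * (1 - s[1] * s[2]) for w, s in zip(weights, OUTCOMES))
    rhs_abs = sum(abs(w) * (1 - s[1] * s[2]) for w, s in zip(weights, OUTCOMES))
    return lhs, rhs, rhs_abs


def signed_example():
    """Hypothetical weights that sum to one but include one negative entry."""
    w = {s: 0.0 for s in OUTCOMES}
    w[(+1, +1, +1)] = 0.25
    w[(+1, +1, -1)] = 0.5
    w[(+1, -1, +1)] = -0.125
    w[(+1, -1, -1)] = 0.375
    return [w[s] for s in OUTCOMES]


if __name__ == "__main__":
    raw = [random.random() for _ in OUTCOMES]
    positive = [r / sum(raw) for r in raw]
    print("positive weights:", bell_sides(positive))
    print("signed weights:  ", bell_sides(signed_example()))
```

With nonnegative weights the first number never exceeds the second. For the signed example the three numbers are 1.25, 0.75 and 1.25: the bound written with the weights themselves fails, while the triangle-inequality bound written with their absolute values still holds.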

CHSH. The CHSH paper makes the same assumption at the identical point in their 
derivation, in their first (unnumbered) equation. 

Peres. Peres' derivation M is almost identical to Eberhard's and makes the same 
assumption of positive weights in the same step, i.e., between steps 1 and 2 of Eq. (2.7)
of this paper. 

Stapp 71. Stapp's 1971 proof [10] is very similar to Bell's. He arrives at an expression
(below his equation 8) 

^<^EK^-M, (A.7) 

j 

where = ±1 and n' 2 j = ±1. He then shows this leads to the contradiction \[2 < 1. 
However, if the n's are ±1, then the summand can only have values 0, 2. If $N_1$ and $N_2$
are the frequencies with which these two values occur, and $N_1 + N_2 = N$, then the right
hand side can be written

V ix0+ I,x2l3 = ^.2(l-*). (A.8) 

As in the Eberhard argument, a contradiction can always be avoided by taking Aq nega- 
tive. 

Stapp 85. Stapp [[| establishes a contradiction by demonstrating (his Eq. 8) that 

$$\frac{1}{n} \sum_i \left[ \sqrt{2}\, r_{Ai}(\lambda_a) + r_{Bi}(\lambda_a) + r_{Bi}(\lambda_b) \right]^2 > (\sqrt{2} - 2)^2, \tag{A.9}$$

where $r_{Ai}(\lambda_a) = \pm 1$, $r_{Bi}(\lambda_a) = \pm 1$ and $r_{Bi}(\lambda_b) = \pm 1$. However, since the r's are ±1, the
summand can have only one of three values: $(\sqrt{2})^2$, $(2 + \sqrt{2})^2$ and $(2 - \sqrt{2})^2$. Then the
above expression can be written as

$$\frac{1}{n} \left[ n_1 (\sqrt{2})^2 + n_2 (\sqrt{2} + 2)^2 + n_3 (2 - \sqrt{2})^2 \right], \tag{A.10}$$

where $n_1$, $n_2$, $n_3$ are the frequencies with which the three terms occur and $n_1 + n_2 + n_3 = n$.
Squaring out and combining terms yields

$$\frac{2(n_1 + n_2 + n_3)}{n} + \frac{2 n_2 (2 + 2\sqrt{2})}{n} + \frac{2 n_3 (2 - 2\sqrt{2})}{n}.$$

Assuming n and n 3 positive, this expression can become negative if 

in other words, if n 2 is sufficiently negative. 
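
A minimal numerical sketch of this point follows, assuming the summand has the form described in the text, namely the square of sqrt(2) times one ±1 quantity plus two further ±1 quantities. It enumerates the possible summand values and evaluates the frequency-weighted mean for one choice with negative n_2. The sample frequencies and the function names are hypothetical.

```python
import itertools
import math


def summand_values():
    """Distinct values of (sqrt(2) r_A + r_B + r_B')^2 for r's equal to +-1."""
    vals = set()
    for r_a, r_b1, r_b2 in itertools.product((+1, -1), repeat=3):
        vals.add(round((math.sqrt(2) * r_a + r_b1 + r_b2) ** 2, 12))
    return sorted(vals)


def weighted_mean(n1, n2, n3):
    """Mean of the three summand values with frequencies n1, n2, n3."""
    n = n1 + n2 + n3
    return (n1 * math.sqrt(2) ** 2
            + n2 * (math.sqrt(2) + 2) ** 2
            + n3 * (2 - math.sqrt(2)) ** 2) / n


if __name__ == "__main__":
    print("summand values:", summand_values())
    print("all weight on the smallest value:", weighted_mean(0.0, 0.0, 1.0))
    print("with negative n2:", weighted_mean(1.0, -0.5, 0.5))
```

With nonnegative frequencies the mean cannot fall below the smallest summand value, (2 - sqrt(2))^2. With n_1 = 1, n_2 = -0.5 and n_3 = 0.5 the mean is 2 - 4 sqrt(2), which is negative.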

Bell 71. A proof that has been cited as qualitatively different from the others is Bell's
1971 proof [17]. This proof is basically the same as the CHSH proof. In Bell's 1971
version the probability density is also explicitly taken to be positive definite. The only
difference is that now $|A(\mathbf{a}, \lambda)| \le 1$ and $|B(\mathbf{b}, \lambda)| \le 1$. (In our notation this corresponds to
< 1 and < 1.) This change merely strengthens the upper bound on the classical 
correlations. That is, in our equation (2.5), whereas previously $|\gamma| = 2$, now $|\gamma| \le 2$. The
rest of the derivation is consequently unaffected and the CHSH inequality continues to
hold. Furthermore, our demonstration of the equivalence of the Eberhard procedure with
the "master probability distribution" procedure is also unaffected, since Eq. (4.2) made
no assumption about the values of the P's.
TABLES AND FIGURES

TABLE I. Spin combinations for standard Bell inequalities. Hidden-variable models
assume that spin-1/2 particles can be emitted with ± spin along each of three axes, a, b
and c. The notation (+ + +), etc., means spin up along all three axes. The eight possible
spin combinations are shown. To ensure conservation of angular momentum, a particle
of the type (+ + +) must be paired with one of type (- - -), and so on.

Population    Particle 1    Particle 2
N_1           (+ + +)       (- - -)
N_2           (+ + -)       (- - +)
N_3           (+ - +)       (- + -)
N_4           (- + +)       (+ - -)
N_5           (+ - -)       (- + +)
N_6           (- + -)       (+ - +)
N_7           (- - +)       (+ + -)
N_8           (- - -)       (+ + +)

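TABLE II. Four-probabilities. Shown are the four-probabilities from the symmetric wave
function as computed from Eq. (3.10). The signs in $P(\pm \pm \pm \pm)$ refer, in order, to the
axes $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{b}_1$, $\mathbf{b}_2$. The quantity

$$A = \tfrac{1}{3} \left[ (\mathbf{a}_1 \cdot \mathbf{a}_2)(\mathbf{b}_1 \cdot \mathbf{b}_2) + (\mathbf{a}_1 \cdot \mathbf{b}_1)(\mathbf{a}_2 \cdot \mathbf{b}_2) + (\mathbf{a}_1 \cdot \mathbf{b}_2)(\mathbf{b}_1 \cdot \mathbf{a}_2) \right].$$

Note that these probabilities sum to one. The four-probabilities for the antisymmetric
wave function can be obtained by flipping the last two signs, i.e., $P(++++)_{AS} = P(++--)_S$,
$P(-+++)_{AS} = P(-+--)_S$, etc.

\begin{align*}
P(++++) = P(----) &= \tfrac{1}{16}\{1 + \mathbf{a}_1 \cdot \mathbf{a}_2 + \mathbf{a}_1 \cdot \mathbf{b}_1 + \mathbf{a}_1 \cdot \mathbf{b}_2 + \mathbf{a}_2 \cdot \mathbf{b}_1 + \mathbf{a}_2 \cdot \mathbf{b}_2 + \mathbf{b}_1 \cdot \mathbf{b}_2 + A\} \\
P(-+++) = P(+---) &= \tfrac{1}{16}\{1 - \mathbf{a}_1 \cdot \mathbf{a}_2 - \mathbf{a}_1 \cdot \mathbf{b}_1 - \mathbf{a}_1 \cdot \mathbf{b}_2 + \mathbf{a}_2 \cdot \mathbf{b}_1 + \mathbf{a}_2 \cdot \mathbf{b}_2 + \mathbf{b}_1 \cdot \mathbf{b}_2 - A\} \\
P(+-++) = P(-+--) &= \tfrac{1}{16}\{1 - \mathbf{a}_1 \cdot \mathbf{a}_2 + \mathbf{a}_1 \cdot \mathbf{b}_1 + \mathbf{a}_1 \cdot \mathbf{b}_2 - \mathbf{a}_2 \cdot \mathbf{b}_1 - \mathbf{a}_2 \cdot \mathbf{b}_2 + \mathbf{b}_1 \cdot \mathbf{b}_2 - A\} \\
P(++-+) = P(--+-) &= \tfrac{1}{16}\{1 + \mathbf{a}_1 \cdot \mathbf{a}_2 - \mathbf{a}_1 \cdot \mathbf{b}_1 + \mathbf{a}_1 \cdot \mathbf{b}_2 - \mathbf{a}_2 \cdot \mathbf{b}_1 + \mathbf{a}_2 \cdot \mathbf{b}_2 - \mathbf{b}_1 \cdot \mathbf{b}_2 - A\} \\
P(+++-) = P(---+) &= \tfrac{1}{16}\{1 + \mathbf{a}_1 \cdot \mathbf{a}_2 + \mathbf{a}_1 \cdot \mathbf{b}_1 - \mathbf{a}_1 \cdot \mathbf{b}_2 + \mathbf{a}_2 \cdot \mathbf{b}_1 - \mathbf{a}_2 \cdot \mathbf{b}_2 - \mathbf{b}_1 \cdot \mathbf{b}_2 - A\} \\
P(++--) = P(--++) &= \tfrac{1}{16}\{1 + \mathbf{a}_1 \cdot \mathbf{a}_2 - \mathbf{a}_1 \cdot \mathbf{b}_1 - \mathbf{a}_1 \cdot \mathbf{b}_2 - \mathbf{a}_2 \cdot \mathbf{b}_1 - \mathbf{a}_2 \cdot \mathbf{b}_2 + \mathbf{b}_1 \cdot \mathbf{b}_2 + A\} \\
P(+-+-) = P(-+-+) &= \tfrac{1}{16}\{1 - \mathbf{a}_1 \cdot \mathbf{a}_2 + \mathbf{a}_1 \cdot \mathbf{b}_1 - \mathbf{a}_1 \cdot \mathbf{b}_2 - \mathbf{a}_2 \cdot \mathbf{b}_1 + \mathbf{a}_2 \cdot \mathbf{b}_2 - \mathbf{b}_1 \cdot \mathbf{b}_2 + A\} \\
P(+--+) = P(-++-) &= \tfrac{1}{16}\{1 - \mathbf{a}_1 \cdot \mathbf{a}_2 - \mathbf{a}_1 \cdot \mathbf{b}_1 + \mathbf{a}_1 \cdot \mathbf{b}_2 + \mathbf{a}_2 \cdot \mathbf{b}_1 - \mathbf{a}_2 \cdot \mathbf{b}_2 - \mathbf{b}_1 \cdot \mathbf{b}_2 + A\}
\end{align*}
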
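TABLE III. Four-probabilities as functions of polarizer angles. Shown are the same
four-probabilities as in Table II for the configuration
$\mathbf{a}_1 \cdot \mathbf{b}_1 = \mathbf{a}_1 \cdot \mathbf{b}_2 = \mathbf{b}_1 \cdot \mathbf{a}_2 = \cos\theta$
and $\mathbf{a}_2 \cdot \mathbf{b}_2 = \cos 3\theta$ (for coplanar axes this also fixes
$\mathbf{a}_1 \cdot \mathbf{a}_2 = \mathbf{b}_1 \cdot \mathbf{b}_2 = \cos 2\theta$).
Now $A = \tfrac{1}{3}(\cos^2\theta + \cos^2 2\theta + \cos\theta \cos 3\theta)$. With the identities
$\cos 2\theta = 2\cos^2\theta - 1$ and $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$ all the probabilities can be written in
terms of one parameter, $\cos\theta = C$. This form makes it more plausible that some of the
P's can become negative.

\begin{align*}
P(++++) = P(----) &= \tfrac{1}{16}\{1 + 3\cos\theta + 2\cos 2\theta + \cos 3\theta + A\} = \tfrac{1}{16}\{4C^3 + 4C^2 - 1 + A\} \\
P(-+++) = P(+---) &= \tfrac{1}{16}\{1 - \cos\theta + \cos 3\theta - A\} = \tfrac{1}{16}\{4C^3 - 4C + 1 - A\} \\
P(+-++) = P(-+--) &= \tfrac{1}{16}\{1 + \cos\theta - \cos 3\theta - A\} = \tfrac{1}{16}\{-4C^3 + 4C + 1 - A\} \\
P(++-+) = P(--+-) &= \tfrac{1}{16}\{1 - \cos\theta + \cos 3\theta - A\} = \tfrac{1}{16}\{4C^3 - 4C + 1 - A\} \\
P(+++-) = P(---+) &= \tfrac{1}{16}\{1 + \cos\theta - \cos 3\theta - A\} = \tfrac{1}{16}\{-4C^3 + 4C + 1 - A\} \\
P(++--) = P(--++) &= \tfrac{1}{16}\{1 + 2\cos 2\theta - 3\cos\theta - \cos 3\theta + A\} = \tfrac{1}{16}\{-4C^3 + 4C^2 - 1 + A\} \\
P(+-+-) = P(-+-+) &= \tfrac{1}{16}\{1 - \cos\theta - 2\cos 2\theta + \cos 3\theta + A\} = \tfrac{1}{16}\{4C^3 - 4C^2 - 4C + 3 + A\} \\
P(+--+) = P(-++-) &= \tfrac{1}{16}\{1 + \cos\theta - 2\cos 2\theta - \cos 3\theta + A\} = \tfrac{1}{16}\{-4C^3 - 4C^2 + 4C + 3 + A\}
\end{align*}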
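
The entries of Tables II and III can be checked numerically. The Python sketch below is a reading of the tables rather than the authors' own calculation. It assumes an overall prefactor of 1/16, the sign pattern in which each dot product carries the product of the two corresponding signs and A carries the product of all four, and coplanar axes at angles 0, 2θ, θ and -θ for a_1, a_2, b_1, b_2. The function names are hypothetical.

```python
import itertools
import math

AXES = ("a1", "a2", "b1", "b2")  # sign order used in P(s1 s2 s3 s4)


def four_probabilities(theta):
    """Return {(s1, s2, s3, s4): P} for the symmetric-wave-function table."""
    # Assumed coplanar configuration reproducing the dot products of Table III.
    angle = {"a1": 0.0, "a2": 2.0 * theta, "b1": theta, "b2": -theta}

    def dot(u, v):
        return math.cos(angle[u] - angle[v])

    big_a = (dot("a1", "a2") * dot("b1", "b2")
             + dot("a1", "b1") * dot("a2", "b2")
             + dot("a1", "b2") * dot("b1", "a2")) / 3.0
    probs = {}
    for signs in itertools.product((+1, -1), repeat=4):
        total = 1.0
        for i, j in itertools.combinations(range(4), 2):
            total += signs[i] * signs[j] * dot(AXES[i], AXES[j])
        total += signs[0] * signs[1] * signs[2] * signs[3] * big_a
        probs[signs] = total / 16.0
    return probs


def pair_marginal(probs, i, j):
    """Sum out the other two axes, leaving the joint distribution for axes i, j."""
    out = {}
    for signs, p in probs.items():
        key = (signs[i], signs[j])
        out[key] = out.get(key, 0.0) + p
    return out


if __name__ == "__main__":
    theta = 2.0 * math.pi / 3.0
    probs = four_probabilities(theta)
    print("sum of all sixteen:", sum(probs.values()))
    print("P(+ + + -):", probs[(+1, +1, +1, -1)])
    print("marginal for (a1, b1):", pair_marginal(probs, 0, 2))
```

At θ = 120 degrees, 16 P(+ + + -) evaluates to -1/2, yet the sixteen values still sum to one and each two-axis marginal equals (1 + s s' cos φ)/4, with φ the angle between the two axes, which is never negative.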






FIG. 1. Four-probabilities from Table III. (a) Plot of 16P(+ + +-). (b) Plot of 
16P(H 1 — ). Note that these quantities become negative. 



